Skip to content

Add simclr eval#13

Open
sanaAyrml wants to merge 59 commits intomainfrom
add_simclr_eval
Open

Add simclr eval#13
sanaAyrml wants to merge 59 commits intomainfrom
add_simclr_eval

Conversation

@sanaAyrml
Copy link
Contributor

@sanaAyrml sanaAyrml commented Feb 1, 2024

PR Type

[Feature ]

Short Description

I added evaluation module in this new PR which adds a classification head to pre-trained backbone. Then it freezes backbone and only train classification head and report accuracy over train and test data.


This change is Reviewable

@afkanpour
Copy link
Collaborator

It looks like this PR includes changes unrelated to linear eval (perhaps from multi-gpu support?). Can you please fix that?

@sanaAyrml
Copy link
Contributor Author

It looks like this PR includes changes unrelated to linear eval (perhaps from multi-gpu support?). Can you please fix that?

I have put this pr before merging ICGAN, maybe that is the reason to have extra changes. I will update it today and put it up.

@sanaAyrml sanaAyrml marked this pull request as draft February 12, 2024 17:50
size (int): Image size.
"""
transform_list = [
transforms.Resize(size=(size,size)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For inference I believe SimCLR performs a random center crop before resizing. Can you please have a look at either their paper or the code to verify?

Copy link
Contributor Author

@sanaAyrml sanaAyrml Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code they didn't have any transformation rather than .ToTensor, but I thought this would fail on imagenet as we have various images sizes. However in the paper they mentioned "As preprocessing, all images were resized to 224 pixels along the shorter side using bicubic resampling, after which we took a 224 × 224 center crop." It is bit vague for me, should we first resize on smaller dimension and then centre crop? or should we first crop on smaller dimension and then resize?

@sanaAyrml sanaAyrml marked this pull request as ready for review February 20, 2024 18:53
data_utils.CenterCropLongEdge(),
transforms.Resize((size, size)),
transforms.ToTensor(),
transforms.Normalize(self.config.norm_mean, self.config.norm_std),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this? Is it causing an error?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this implementation should be available. Do you need additional capability beyond this? https://github.com/VectorInstitute/GenerativeSSL/blob/main/icgan/data_utils/utils.py#L29

@@ -0,0 +1,51 @@
#!/bin/bash

#SBATCH --job-name=train_sunrgbd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added changes to run_simCLR.py in my PR that takes checkpoint_dir and creates a directory based on the time of running the training job. So let's revert these changes.

from torch import nn
from torchvision import models

from ..exceptions.exceptions import InvalidBackboneError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use absolute paths.

self.device_id = kwargs["device_id"]
self.writer = SummaryWriter()
# Create a directory to save the model checkpoints and logs
now = datetime.now()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added logic to run_simCLR.py that does something like this. So you have to either remove these parts, or remove my changes before merging.

#SBATCH --nodes=1
#SBATCH --gres=gpu:4
#SBATCH --ntasks-per-node=4
#SBATCH --ntasks-per-node=1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep in mind that this uses a single GPU per node.

# srun” executes the script <ntasks-per-node * nodes> times
srun python run_simCLR.py \
# srun execute ntasks-per-node * nodes times
srun pythong run_simCLR.py \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pythong?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the github repo from which we borrowed simclr implementation provides pretrained model checkpoints, can you please download one of them and test this script on it to see if the results match with what's reported by them?

Copy link
Collaborator

@afkanpour afkanpour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this? If not, should we delete it?

Reviewable status: 0 of 19 files reviewed, 30 unresolved discussions (waiting on @sanaAyrml and @vahid0001)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants